Dataset statistics
| Number of variables | 15 |
|---|---|
| Number of observations | 44837 |
| Missing cells | 40622 |
| Missing cells (%) | 6.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 5.1 MiB |
| Average record size in memory | 120.0 B |
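The headline figures in this table can be cross-checked against each other. The sketch below assumes (the report does not state it) that the missing-cell percentage is taken over all cells, that is, observations times variables, and that the total memory size is the number of observations times the average record size.

```python
# Cross-check of the dataset-level figures reported in the table above.
n_variables = 15
n_observations = 44837
missing_cells = 40622
avg_record_bytes = 120.0

# Assumption: the percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: total size = observations * average record size (1 MiB = 2**20 bytes).
total_mib = n_observations * avg_record_bytes / 2**20

print(f"Total cells: {total_cells}")
print(f"Missing cells: {missing_pct:.1f}%")
print(f"Total size: {total_mib:.1f} MiB")
```

Under these assumptions the recomputed values agree with the 6.0% and 5.1 MiB shown in the table.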
Variable types
| Numeric | 9 |
|---|---|
| Text | 6 |
Alerts
| Alert | Type |
|---|---|
| popularity is highly correlated overall with vote_count | High correlation |
| vote_count is highly correlated overall with popularity | High correlation |
| id_collection has 40375 (90.0%) missing values | Missing |
| popularity is highly skewed (γ1 = 29.46841147) | Skewed |
| id has unique values | Unique |
| budget has 35999 (80.3%) zeros | Zeros |
| runtime has 1506 (3.4%) zeros | Zeros |
| vote_average has 2880 (6.4%) zeros | Zeros |
| vote_count has 2784 (6.2%) zeros | Zeros |
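The skewness alert reports γ1, a measure of how asymmetric a distribution is. As a rough illustration, the sketch below computes the population form of skewness, the third central moment divided by the 1.5 power of the second. This is an assumption about the formula: the profiling tool may apply a bias correction, so this function is not guaranteed to reproduce 29.47 exactly. The two samples are hypothetical toy data, not values from the dataset.

```python
def skewness(values):
    """Population skewness g1 = m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical toy samples, not taken from the dataset.
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
long_tailed = [0.1, 0.2, 0.3, 0.4, 50.0]

print(skewness(symmetric))    # symmetric sample: skewness is zero
print(skewness(long_tailed))  # one large value pulls the skewness positive
```

A single extreme value is enough to push the statistic well above zero, which is consistent with popularity having a maximum of 547.49 against a median near 1.14.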
Reproduction
| Analysis started | 2023-06-10 22:22:31.064524 |
|---|---|
| Analysis finished | 2023-06-10 22:23:01.073202 |
| Duration | 30.01 seconds |
| Software version | ydata-profiling v4.2.0 |
| Download configuration | config.json |
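A report of this kind can be regenerated with ydata-profiling, the tool named in the table above. The following is a minimal sketch only: the function name `build_profile`, the CSV path argument, the output file name, and the report title are all hypothetical, and the settings in the original `config.json` are not reproduced here.

```python
def build_profile(csv_path, html_path="report.html"):
    """Profile a CSV file and write the report as an HTML file."""
    # Imported lazily so the function can be defined without the optional
    # dependencies (pandas and ydata-profiling) being installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    # The title is a placeholder; the original report's title is not known.
    profile = ProfileReport(df, title="Dataset profile")
    profile.to_file(html_path)
    return profile
```

Calling `build_profile("movies.csv")`, where `movies.csv` is a placeholder file name, would write `report.html` next to the script. Timings will differ from the 30.01 seconds shown above depending on hardware and configuration.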
id
Real number (ℝ)
| Distinct | 44837 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 106925.11 |
| Minimum | 2 |
| Maximum | 469172 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 350.4 KiB |
Quantile statistics
| Minimum | 2 |
|---|---|
| 5-th percentile | 5205.8 |
| Q1 | 26203 |
| median | 59115 |
| Q3 | 153854 |
| 95-th percentile | 354129 |
| Maximum | 469172 |
| Range | 469170 |
| Interquartile range (IQR) | 127651 |
Descriptive statistics
| Standard deviation | 111095.38 |
|---|---|
| Coefficient of variation (CV) | 1.0390017 |
| Kurtosis | 0.57769623 |
| Mean | 106925.11 |
| Median Absolute Deviation (MAD) | 43834 |
| Skewness | 1.2884833 |
| Sum | 4.7942013 × 10⁹ |
| Variance | 1.2342183 × 10¹⁰ |
| Monotonicity | Not monotonic |
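Several of the statistics above are derived from others, so they can be recomputed as a consistency check. The sketch below uses the rounded values printed for the id variable and assumes the usual definitions (range = max − min, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared); small differences in the last digits come from that rounding.

```python
# Values reported for the id variable in the tables above.
minimum, maximum = 2, 469172
q1, q3 = 26203, 153854
mean, std = 106925.11, 111095.38

value_range = maximum - minimum  # spread between the extremes
iqr = q3 - q1                    # spread of the middle 50% of values
cv = std / mean                  # dispersion relative to the mean
variance = std ** 2              # square of the standard deviation

print(f"Range: {value_range}")
print(f"IQR: {iqr}")
print(f"CV: {cv:.5f}")
print(f"Variance: {variance:.4e}")
```

The same relationships hold for the other numeric variables in this report, for example runtime, where Q3 − Q1 = 107 − 85 = 22.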
| Value | Count | Frequency (%) |
| 862 | 1 | < 0.1% |
| 182424 | 1 | < 0.1% |
| 87321 | 1 | < 0.1% |
| 291613 | 1 | < 0.1% |
| 103675 | 1 | < 0.1% |
| 171424 | 1 | < 0.1% |
| 337210 | 1 | < 0.1% |
| 113040 | 1 | < 0.1% |
| 244418 | 1 | < 0.1% |
| 114953 | 1 | < 0.1% |
| Other values (44827) | 44827 |
| Value | Count | Frequency (%) |
| 2 | 1 | |
| 3 | 1 | |
| 5 | 1 | |
| 6 | 1 | |
| 11 | 1 | |
| 12 | 1 | |
| 13 | 1 | |
| 14 | 1 | |
| 15 | 1 | |
| 16 | 1 |
| Value | Count | Frequency (%) |
| 469172 | 1 | |
| 468343 | 1 | |
| 462788 | 1 | |
| 461955 | 1 | |
| 461805 | 1 | |
| 461615 | 1 | |
| 461533 | 1 | |
| 461089 | 1 | |
| 460870 | 1 | |
| 460822 | 1 |
title
Text
| Distinct | 41740 |
|---|---|
| Distinct (%) | 93.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 350.4 KiB |
Length
| Max length | 105 |
|---|---|
| Median length | 79 |
| Mean length | 16.693557 |
| Min length | 1 |
Characters and Unicode
| Total characters | 748489 |
|---|---|
| Distinct characters | 285 |
| Distinct categories | 17 |
| Distinct scripts | 7 |
| Distinct blocks | 12 |
Unique
| Unique | 39470 |
|---|---|
| Unique (%) | 88.0% |
Sample
| 1st row | Toy Story |
|---|---|
| 2nd row | Jumanji |
| 3rd row | Grumpier Old Men |
| 4th row | Waiting to Exhale |
| 5th row | Father of the Bride Part II |
| Value | Count | Frequency (%) |
| the | 14413 | 10.7% |
| of | 4870 | 3.6% |
| a | 2203 | 1.6% |
| in | 1676 | 1.2% |
| and | 1607 | 1.2% |
| to | 1039 | 0.8% |
| | 749 | 0.6% |
| man | 661 | 0.5% |
| love | 655 | 0.5% |
| for | 594 | 0.4% |
| Other values (24114) | 106056 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 89708 | 12.0% |
| e | 75366 | 10.1% |
| a | 48337 | 6.5% |
| o | 45113 | 6.0% |
| n | 40322 | 5.4% |
| r | 39529 | 5.3% |
| i | 39209 | 5.2% |
| t | 36257 | 4.8% |
| s | 29154 | 3.9% |
| h | 28210 | 3.8% |
| Other values (275) | 277284 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 527495 | |
| Uppercase Letter | 115863 | 15.5% |
| Space Separator | 89708 | 12.0% |
| Other Punctuation | 10349 | 1.4% |
| Decimal Number | 3796 | 0.5% |
| Dash Punctuation | 967 | 0.1% |
| Close Punctuation | 86 | < 0.1% |
| Open Punctuation | 84 | < 0.1% |
| Final Punctuation | 38 | < 0.1% |
| Other Letter | 25 | < 0.1% |
| Other values (7) | 78 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 75366 | |
| a | 48337 | |
| o | 45113 | 8.6% |
| n | 40322 | 7.6% |
| r | 39529 | 7.5% |
| i | 39209 | 7.4% |
| t | 36257 | 6.9% |
| s | 29154 | 5.5% |
| h | 28210 | 5.3% |
| l | 25609 | 4.9% |
| Other values (121) | 120389 |
Uppercase Letter
| Value | Count | Frequency (%) |
| T | 15833 | |
| S | 10214 | 8.8% |
| M | 7938 | 6.9% |
| B | 7567 | 6.5% |
| C | 7084 | 6.1% |
| A | 6693 | 5.8% |
| D | 6249 | 5.4% |
| L | 5806 | 5.0% |
| H | 5106 | 4.4% |
| W | 5096 | 4.4% |
| Other values (63) | 38277 |
Other Letter
| Value | Count | Frequency (%) |
| ه | 2 | 8.0% |
| ی | 2 | 8.0% |
| چ | 2 | 8.0% |
| ک | 2 | 8.0% |
| 傳 | 1 | 4.0% |
| ج | 1 | 4.0% |
| 空 | 1 | 4.0% |
| 時 | 1 | 4.0% |
| 狗 | 1 | 4.0% |
| ا | 1 | 4.0% |
| Other values (11) | 11 |
Other Punctuation
| Value | Count | Frequency (%) |
| : | 3667 | |
| ' | 2471 | |
| . | 1585 | |
| , | 1112 | 10.7% |
| ! | 642 | 6.2% |
| & | 454 | 4.4% |
| ? | 265 | 2.6% |
| / | 77 | 0.7% |
| * | 19 | 0.2% |
| # | 13 | 0.1% |
| Other values (8) | 44 | 0.4% |
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 851 | |
| 1 | 691 | |
| 0 | 608 | |
| 3 | 473 | |
| 9 | 228 | 6.0% |
| 4 | 225 | 5.9% |
| 5 | 219 | 5.8% |
| 7 | 190 | 5.0% |
| 8 | 157 | 4.1% |
| 6 | 154 | 4.1% |
Math Symbol
| Value | Count | Frequency (%) |
| + | 17 | |
| × | 2 | 8.7% |
| = | 1 | 4.3% |
| ∞ | 1 | 4.3% |
| → | 1 | 4.3% |
| − | 1 | 4.3% |
Other Number
| Value | Count | Frequency (%) |
| ½ | 12 | |
| ² | 3 | 15.8% |
| ³ | 2 | 10.5% |
| ⅓ | 1 | 5.3% |
| ⁴ | 1 | 5.3% |
Other Symbol
| Value | Count | Frequency (%) |
| ° | 3 | |
| ☆ | 2 | |
| ™ | 1 | 12.5% |
| ♡ | 1 | 12.5% |
| № | 1 | 12.5% |
Currency Symbol
| Value | Count | Frequency (%) |
| $ | 18 | |
| ¢ | 2 | 9.5% |
| £ | 1 | 4.8% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 952 | |
| – | 15 | 1.6% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 81 | |
| ] | 5 | 5.8% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 79 | |
| [ | 5 | 6.0% |
Final Punctuation
| Value | Count | Frequency (%) |
| ’ | 37 | |
| ” | 1 | 2.6% |
Initial Punctuation
| Value | Count | Frequency (%) |
| ‘ | 1 | |
| “ | 1 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 89708 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 3 |
Format
| Value | Count | Frequency (%) |
| | 2 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 642851 | |
| Common | 105106 | 14.0% |
| Cyrillic | 338 | < 0.1% |
| Greek | 170 | < 0.1% |
| Arabic | 11 | < 0.1% |
| Katakana | 8 | < 0.1% |
| Han | 5 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 75366 | 11.7% |
| a | 48337 | 7.5% |
| o | 45113 | 7.0% |
| n | 40322 | 6.3% |
| r | 39529 | 6.1% |
| i | 39209 | 6.1% |
| t | 36257 | 5.6% |
| s | 29154 | 4.5% |
| h | 28210 | 4.4% |
| l | 25609 | 4.0% |
| Other values (106) | 235745 |
Common
| Value | Count | Frequency (%) |
| (space) | 89708 | |
| : | 3667 | 3.5% |
| ' | 2471 | 2.4% |
| . | 1585 | 1.5% |
| , | 1112 | 1.1% |
| - | 952 | 0.9% |
| 2 | 851 | 0.8% |
| 1 | 691 | 0.7% |
| ! | 642 | 0.6% |
| 0 | 608 | 0.6% |
| Other values (50) | 2819 | 2.7% |
Cyrillic
| Value | Count | Frequency (%) |
| о | 32 | 9.5% |
| е | 32 | 9.5% |
| а | 27 | 8.0% |
| н | 23 | 6.8% |
| и | 22 | 6.5% |
| р | 22 | 6.5% |
| к | 17 | 5.0% |
| с | 15 | 4.4% |
| т | 14 | 4.1% |
| в | 13 | 3.8% |
| Other values (37) | 121 |
Greek
| Value | Count | Frequency (%) |
| α | 20 | 11.8% |
| ο | 14 | 8.2% |
| ι | 14 | 8.2% |
| τ | 9 | 5.3% |
| ά | 8 | 4.7% |
| λ | 8 | 4.7% |
| ρ | 8 | 4.7% |
| ν | 7 | 4.1% |
| ε | 6 | 3.5% |
| η | 6 | 3.5% |
| Other values (32) | 70 |
Katakana
| Value | Count | Frequency (%) |
| ァ | 1 | |
| ポ | 1 | |
| ィ | 1 | |
| テ | 1 | |
| ス | 1 | |
| タ | 1 | |
| ン | 1 | |
| フ | 1 |
Arabic
| Value | Count | Frequency (%) |
| ه | 2 | |
| ی | 2 | |
| چ | 2 | |
| ک | 2 | |
| ج | 1 | |
| ا | 1 | |
| س | 1 |
Han
| Value | Count | Frequency (%) |
| 傳 | 1 | |
| 空 | 1 | |
| 時 | 1 | |
| 狗 | 1 | |
| 貓 | 1 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 746957 | |
| None | 1099 | 0.1% |
| Cyrillic | 338 | < 0.1% |
| Punctuation | 62 | < 0.1% |
| Arabic | 11 | < 0.1% |
| Katakana | 8 | < 0.1% |
| CJK | 5 | < 0.1% |
| Misc Symbols | 3 | < 0.1% |
| Letterlike Symbols | 2 | < 0.1% |
| Math Operators | 2 | < 0.1% |
| Other values (2) | 2 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 89708 | 12.0% |
| e | 75366 | 10.1% |
| a | 48337 | 6.5% |
| o | 45113 | 6.0% |
| n | 40322 | 5.4% |
| r | 39529 | 5.3% |
| i | 39209 | 5.2% |
| t | 36257 | 4.9% |
| s | 29154 | 3.9% |
| h | 28210 | 3.8% |
| Other values (76) | 275752 |
None
| Value | Count | Frequency (%) |
| é | 212 | |
| ä | 126 | 11.5% |
| ö | 55 | 5.0% |
| è | 52 | 4.7% |
| ô | 44 | 4.0% |
| ü | 38 | 3.5% |
| ó | 36 | 3.3% |
| ı | 35 | 3.2% |
| á | 34 | 3.1% |
| í | 32 | 2.9% |
| Other values (107) | 435 |
Punctuation
| Value | Count | Frequency (%) |
| ’ | 37 | |
| – | 15 | |
| … | 5 | 8.1% |
| | 2 | 3.2% |
| ‘ | 1 | 1.6% |
| “ | 1 | 1.6% |
| ” | 1 | 1.6% |
Cyrillic
| Value | Count | Frequency (%) |
| о | 32 | 9.5% |
| е | 32 | 9.5% |
| а | 27 | 8.0% |
| н | 23 | 6.8% |
| и | 22 | 6.5% |
| р | 22 | 6.5% |
| к | 17 | 5.0% |
| с | 15 | 4.4% |
| т | 14 | 4.1% |
| в | 13 | 3.8% |
| Other values (37) | 121 |
Arabic
| Value | Count | Frequency (%) |
| ه | 2 | |
| ی | 2 | |
| چ | 2 | |
| ک | 2 | |
| ج | 1 | |
| ا | 1 | |
| س | 1 |
Misc Symbols
| Value | Count | Frequency (%) |
| ☆ | 2 | |
| ♡ | 1 |
Letterlike Symbols
| Value | Count | Frequency (%) |
| ™ | 1 | |
| № | 1 |
CJK
| Value | Count | Frequency (%) |
| 傳 | 1 | |
| 空 | 1 | |
| 時 | 1 | |
| 狗 | 1 | |
| 貓 | 1 |
Number Forms
| Value | Count | Frequency (%) |
| ⅓ | 1 |
Katakana
| Value | Count | Frequency (%) |
| ァ | 1 | |
| ポ | 1 | |
| ィ | 1 | |
| テ | 1 | |
| ス | 1 | |
| タ | 1 | |
| ン | 1 | |
| フ | 1 |
Math Operators
| Value | Count | Frequency (%) |
| ∞ | 1 | |
| − | 1 |
Arrows
| Value | Count | Frequency (%) |
| → | 1 |
budget
Real number (ℝ)
| Distinct | 1215 |
|---|---|
| Distinct (%) | 2.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4273734.5 |
| Minimum | 0 |
| Maximum | 3.8 × 10⁸ |
| Zeros | 35999 |
| Zeros (%) | 80.3% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 350.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 25000000 |
| Maximum | 3.8 × 10⁸ |
| Range | 3.8 × 10⁸ |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 17532226 |
|---|---|
| Coefficient of variation (CV) | 4.1023198 |
| Kurtosis | 65.92765 |
| Mean | 4273734.5 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 7.0821562 |
| Sum | 1.9162143 × 10¹¹ |
| Variance | 3.0737894 × 10¹⁴ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 35999 | |
| 5000000 | 285 | 0.6% |
| 10000000 | 258 | 0.6% |
| 20000000 | 242 | 0.5% |
| 2000000 | 237 | 0.5% |
| 15000000 | 225 | 0.5% |
| 3000000 | 223 | 0.5% |
| 25000000 | 206 | 0.5% |
| 1000000 | 197 | 0.4% |
| 30000000 | 189 | 0.4% |
| Other values (1205) | 6776 | 15.1% |
| Value | Count | Frequency (%) |
| 0 | 35999 | |
| 1 | 25 | 0.1% |
| 2 | 13 | < 0.1% |
| 3 | 9 | < 0.1% |
| 4 | 7 | < 0.1% |
| 5 | 7 | < 0.1% |
| 6 | 5 | < 0.1% |
| 7 | 4 | < 0.1% |
| 8 | 5 | < 0.1% |
| 9 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 380000000 | 1 | < 0.1% |
| 300000000 | 1 | < 0.1% |
| 280000000 | 1 | < 0.1% |
| 270000000 | 1 | < 0.1% |
| 260000000 | 3 | < 0.1% |
| 258000000 | 1 | < 0.1% |
| 255000000 | 1 | < 0.1% |
| 250000000 | 10 | |
| 245000000 | 2 | < 0.1% |
| 237000000 | 1 | < 0.1% |
| Distinct | 89 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 11 |
| Missing (%) | < 0.1% |
| Memory size | 350.4 KiB |
Length
| Max length | 2 |
|---|---|
| Median length | 2 |
| Mean length | 2 |
| Min length | 2 |
Characters and Unicode
| Total characters | 89652 |
|---|---|
| Distinct characters | 26 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 17 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | en |
|---|---|
| 2nd row | en |
| 3rd row | en |
| 4th row | en |
| 5th row | en |
| Value | Count | Frequency (%) |
| en | 31894 | |
| fr | 2385 | 5.3% |
| it | 1515 | 3.4% |
| ja | 1340 | 3.0% |
| de | 1063 | 2.4% |
| es | 976 | 2.2% |
| ru | 782 | 1.7% |
| hi | 503 | 1.1% |
| ko | 442 | 1.0% |
| zh | 403 | 0.9% |
| Other values (79) | 3523 | 7.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 34187 | |
| n | 32592 | |
| r | 3538 | 3.9% |
| f | 2770 | 3.1% |
| i | 2361 | 2.6% |
| t | 2228 | 2.5% |
| a | 1818 | 2.0% |
| s | 1626 | 1.8% |
| j | 1341 | 1.5% |
| d | 1305 | 1.5% |
| Other values (16) | 5886 | 6.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 89652 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 34187 | |
| n | 32592 | |
| r | 3538 | 3.9% |
| f | 2770 | 3.1% |
| i | 2361 | 2.6% |
| t | 2228 | 2.5% |
| a | 1818 | 2.0% |
| s | 1626 | 1.8% |
| j | 1341 | 1.5% |
| d | 1305 | 1.5% |
| Other values (16) | 5886 | 6.6% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 89652 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 34187 | |
| n | 32592 | |
| r | 3538 | 3.9% |
| f | 2770 | 3.1% |
| i | 2361 | 2.6% |
| t | 2228 | 2.5% |
| a | 1818 | 2.0% |
| s | 1626 | 1.8% |
| j | 1341 | 1.5% |
| d | 1305 | 1.5% |
| Other values (16) | 5886 | 6.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 89652 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 34187 | |
| n | 32592 | |
| r | 3538 | 3.9% |
| f | 2770 | 3.1% |
| i | 2361 | 2.6% |
| t | 2228 | 2.5% |
| a | 1818 | 2.0% |
| s | 1626 | 1.8% |
| j | 1341 | 1.5% |
| d | 1305 | 1.5% |
| Other values (16) | 5886 | 6.6% |
popularity
Real number (ℝ)
Alerts: High correlation, Skewed
| Distinct | 43251 |
|---|---|
| Distinct (%) | 96.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.9368967 |
| Minimum | 0 |
| Maximum | 547.4883 |
| Zeros | 39 |
| Zeros (%) | 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 350.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0.0214034 |
| Q1 | 0.393975 |
| median | 1.138972 |
| Q3 | 3.732537 |
| 95-th percentile | 11.073024 |
| Maximum | 547.4883 |
| Range | 547.4883 |
| Interquartile range (IQR) | 3.338562 |
Descriptive statistics
| Standard deviation | 6.0118511 |
|---|---|
| Coefficient of variation (CV) | 2.047008 |
| Kurtosis | 1943.3744 |
| Mean | 2.9368967 |
| Median Absolute Deviation (MAD) | 0.975422 |
| Skewness | 29.468411 |
| Sum | 131681.64 |
| Variance | 36.142354 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 × 10⁻⁶ | 54 | 0.1% |
| 0.000308 | 42 | 0.1% |
| 0 | 39 | 0.1% |
| 0.00022 | 39 | 0.1% |
| 0.001177 | 38 | 0.1% |
| 0.000578 | 38 | 0.1% |
| 0.000844 | 38 | 0.1% |
| 0.002001 | 27 | 0.1% |
| 0.003013 | 21 | < 0.1% |
| 0.00353 | 19 | < 0.1% |
| Other values (43241) | 44482 |
| Value | Count | Frequency (%) |
| 0 | 39 | |
| 1 × 10⁻⁶ | 54 | |
| 2 × 10⁻⁶ | 5 | < 0.1% |
| 3 × 10⁻⁶ | 5 | < 0.1% |
| 4 × 10⁻⁶ | 5 | < 0.1% |
| 5 × 10⁻⁶ | 1 | < 0.1% |
| 6 × 10⁻⁶ | 2 | < 0.1% |
| 7 × 10⁻⁶ | 1 | < 0.1% |
| 8 × 10⁻⁶ | 6 | < 0.1% |
| 9 × 10⁻⁶ | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 547.488298 | 1 | |
| 294.337037 | 1 | |
| 287.253654 | 1 | |
| 228.032744 | 1 | |
| 213.849907 | 1 | |
| 187.860492 | 1 | |
| 185.330992 | 1 | |
| 185.070892 | 1 | |
| 183.870374 | 1 | |
| 154.801009 | 1 |
runtime
Real number (ℝ)
| Distinct | 353 |
|---|---|
| Distinct (%) | 0.8% |
| Missing | 236 |
| Missing (%) | 0.5% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 94.349432 |
| Minimum | 0 |
| Maximum | 1256 |
| Zeros | 1506 |
| Zeros (%) | 3.4% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 350.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 13 |
| Q1 | 85 |
| median | 95 |
| Q3 | 107 |
| 95-th percentile | 138 |
| Maximum | 1256 |
| Range | 1256 |
| Interquartile range (IQR) | 22 |
Descriptive statistics
| Standard deviation | 38.238363 |
|---|---|
| Coefficient of variation (CV) | 0.40528451 |
| Kurtosis | 95.911275 |
| Mean | 94.349432 |
| Median Absolute Deviation (MAD) | 11 |
| Skewness | 4.5804385 |
| Sum | 4208079 |
| Variance | 1462.1724 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 90 | 2521 | 5.6% |
| 0 | 1506 | 3.4% |
| 100 | 1460 | 3.3% |
| 95 | 1391 | 3.1% |
| 93 | 1197 | 2.7% |
| 96 | 1096 | 2.4% |
| 92 | 1065 | 2.4% |
| 94 | 1055 | 2.4% |
| 91 | 1045 | 2.3% |
| 97 | 1015 | 2.3% |
| Other values (343) | 31250 |
| Value | Count | Frequency (%) |
| 0 | 1506 | |
| 1 | 100 | 0.2% |
| 2 | 29 | 0.1% |
| 3 | 39 | 0.1% |
| 4 | 43 | 0.1% |
| 5 | 49 | 0.1% |
| 6 | 70 | 0.2% |
| 7 | 98 | 0.2% |
| 8 | 72 | 0.2% |
| 9 | 59 | 0.1% |
| Value | Count | Frequency (%) |
| 1256 | 1 | |
| 1140 | 2 | |
| 931 | 1 | |
| 925 | 1 | |
| 900 | 1 | |
| 877 | 1 | |
| 874 | 1 | |
| 840 | 2 | |
| 780 | 1 | |
| 720 | 1 |
vote_average
Real number (ℝ)
| Distinct | 92 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.6280884 |
| Minimum | 0 |
| Maximum | 10 |
| Zeros | 2880 |
| Zeros (%) | 6.4% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 350.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 5 |
| median | 6 |
| Q3 | 6.8 |
| 95-th percentile | 7.8 |
| Maximum | 10 |
| Range | 10 |
| Interquartile range (IQR) | 1.8 |
Descriptive statistics
| Standard deviation | 1.9086658 |
|---|---|
| Coefficient of variation (CV) | 0.33913217 |
| Kurtosis | 2.5747989 |
| Mean | 5.6280884 |
| Median Absolute Deviation (MAD) | 0.9 |
| Skewness | -1.5297195 |
| Sum | 252346.6 |
| Variance | 3.6430053 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 2880 | 6.4% |
| 6 | 2422 | 5.4% |
| 5 | 1962 | 4.4% |
| 7 | 1855 | 4.1% |
| 6.5 | 1705 | 3.8% |
| 6.3 | 1589 | 3.5% |
| 5.5 | 1366 | 3.0% |
| 5.8 | 1352 | 3.0% |
| 6.4 | 1342 | 3.0% |
| 6.7 | 1328 | 3.0% |
| Other values (82) | 27036 |
| Value | Count | Frequency (%) |
| 0 | 2880 | |
| 0.5 | 13 | < 0.1% |
| 0.7 | 1 | < 0.1% |
| 1 | 100 | 0.2% |
| 1.1 | 1 | < 0.1% |
| 1.2 | 4 | < 0.1% |
| 1.3 | 12 | < 0.1% |
| 1.4 | 5 | < 0.1% |
| 1.5 | 30 | 0.1% |
| 1.6 | 6 | < 0.1% |
| Value | Count | Frequency (%) |
| 10 | 179 | |
| 9.8 | 1 | < 0.1% |
| 9.6 | 1 | < 0.1% |
| 9.5 | 18 | < 0.1% |
| 9.4 | 3 | < 0.1% |
| 9.3 | 18 | < 0.1% |
| 9.2 | 4 | < 0.1% |
| 9.1 | 2 | < 0.1% |
| 9 | 154 | |
| 8.9 | 7 | < 0.1% |
vote_count
Real number (ℝ)
Alerts: High correlation, Zeros
| Distinct | 1820 |
|---|---|
| Distinct (%) | 4.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 111.20153 |
| Minimum | 0 |
| Maximum | 14075 |
| Zeros | 2784 |
| Zeros (%) | 6.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 350.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 3 |
| median | 10 |
| Q3 | 35 |
| 95-th percentile | 441 |
| Maximum | 14075 |
| Range | 14075 |
| Interquartile range (IQR) | 32 |
Descriptive statistics
| Standard deviation | 494.55158 |
|---|---|
| Coefficient of variation (CV) | 4.4473451 |
| Kurtosis | 149.17758 |
| Mean | 111.20153 |
| Median Absolute Deviation (MAD) | 8 |
| Skewness | 10.380647 |
| Sum | 4985943 |
| Variance | 244581.27 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 3183 | 7.1% |
| 2 | 3071 | 6.8% |
| 0 | 2784 | 6.2% |
| 3 | 2728 | 6.1% |
| 4 | 2431 | 5.4% |
| 5 | 2072 | 4.6% |
| 6 | 1720 | 3.8% |
| 7 | 1541 | 3.4% |
| 8 | 1350 | 3.0% |
| 9 | 1182 | 2.6% |
| Other values (1810) | 22775 |
| Value | Count | Frequency (%) |
| 0 | 2784 | |
| 1 | 3183 | |
| 2 | 3071 | |
| 3 | 2728 | |
| 4 | 2431 | |
| 5 | 2072 | |
| 6 | 1720 | |
| 7 | 1541 | |
| 8 | 1350 | |
| 9 | 1182 | 2.6% |
| Value | Count | Frequency (%) |
| 14075 | 1 | |
| 12269 | 1 | |
| 12114 | 1 | |
| 12000 | 1 | |
| 11444 | 1 | |
| 11187 | 1 | |
| 10297 | 1 | |
| 10014 | 1 | |
| 9678 | 1 | |
| 9634 | 1 |
id_collection
Real number (ℝ)
| Distinct | 1690 |
|---|---|
| Distinct (%) | 37.9% |
| Missing | 40375 |
| Missing (%) | 90.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 183639.64 |
| Minimum | 10 |
| Maximum | 480160 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 350.4 KiB |
Quantile statistics
| Minimum | 10 |
|---|---|
| 5-th percentile | 2704 |
| Q1 | 85960 |
| median | 141286 |
| Q3 | 293196 |
| 95-th percentile | 439014.45 |
| Maximum | 480160 |
| Range | 480150 |
| Interquartile range (IQR) | 207236 |
Descriptive statistics
| Standard deviation | 141536.13 |
|---|---|
| Coefficient of variation (CV) | 0.77072756 |
| Kurtosis | -0.92324343 |
| Mean | 183639.64 |
| Median Absolute Deviation (MAD) | 104025 |
| Skewness | 0.53627669 |
| Sum | 8.1940008 × 10⁸ |
| Variance | 2.0032477 × 10¹⁰ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 415931 | 29 | 0.1% |
| 645 | 26 | 0.1% |
| 96887 | 26 | 0.1% |
| 421566 | 26 | 0.1% |
| 34055 | 25 | 0.1% |
| 37261 | 22 | < 0.1% |
| 413661 | 21 | < 0.1% |
| 374509 | 16 | < 0.1% |
| 425164 | 15 | < 0.1% |
| 148324 | 15 | < 0.1% |
| Other values (1680) | 4241 | 9.5% |
| (Missing) | 40375 |
| Value | Count | Frequency (%) |
| 10 | 8 | |
| 84 | 4 | |
| 119 | 3 | < 0.1% |
| 131 | 3 | < 0.1% |
| 151 | 6 | |
| 230 | 3 | < 0.1% |
| 263 | 3 | < 0.1% |
| 264 | 3 | < 0.1% |
| 295 | 5 | |
| 304 | 3 | < 0.1% |
| Value | Count | Frequency (%) |
| 480160 | 1 | < 0.1% |
| 480071 | 1 | < 0.1% |
| 479971 | 1 | < 0.1% |
| 479888 | 2 | < 0.1% |
| 479692 | 2 | < 0.1% |
| 479549 | 1 | < 0.1% |
| 479319 | 13 | |
| 478947 | 2 | < 0.1% |
| 478628 | 12 | |
| 478545 | 1 | < 0.1% |
name_genres
Text
| Distinct | 4035 |
|---|---|
| Distinct (%) | 9.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 350.4 KiB |
Length
| Max length | 98 |
|---|---|
| Median length | 84 |
| Mean length | 21.655619 |
| Min length | 2 |
Characters and Unicode
| Total characters | 970973 |
|---|---|
| Distinct characters | 43 |
| Distinct categories | 6 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 2343 |
|---|---|
| Unique (%) | 5.2% |
Sample
| 1st row | ['Animation', 'Comedy', 'Family'] |
|---|---|
| 2nd row | ['Adventure', 'Fantasy', 'Family'] |
| 3rd row | ['Romance', 'Comedy'] |
| 4th row | ['Comedy', 'Drama', 'Romance'] |
| 5th row | ['Comedy'] |
| Value | Count | Frequency (%) |
| drama | 20074 | |
| comedy | 12986 | |
| thriller | 7564 | 7.9% |
| romance | 6663 | 6.9% |
| action | 6529 | 6.8% |
| horror | 4624 | 4.8% |
| crime | 4274 | 4.4% |
| documentary | 3868 | 4.0% |
| adventure | 3477 | 3.6% |
| science | 3016 | 3.1% |
| Other values (37) | 23182 |
Most occurring characters
| Value | Count | Frequency (%) |
| ' | 180280 | |
| r | 68492 | 7.1% |
| a | 61215 | 6.3% |
| e | 55235 | 5.7% |
| m | 52509 | 5.4% |
| (space) | 51420 | 5.3% |
| o | 47965 | 4.9% |
| , | 47635 | 4.9% |
| [ | 44837 | 4.6% |
| ] | 44837 | 4.6% |
| Other values (33) | 316548 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 507278 | |
| Other Punctuation | 227915 | |
| Uppercase Letter | 94686 | 9.8% |
| Space Separator | 51420 | 5.3% |
| Open Punctuation | 44837 | 4.6% |
| Close Punctuation | 44837 | 4.6% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| r | 68492 | |
| a | 61215 | |
| e | 55235 | |
| m | 52509 | |
| o | 47965 | |
| i | 39295 | |
| n | 35304 | |
| y | 28158 | |
| c | 27704 | |
| t | 25957 | 5.1% |
| Other values (12) | 65444 |
Uppercase Letter
| Value | Count | Frequency (%) |
| D | 23942 | |
| C | 17263 | |
| A | 11909 | |
| F | 9640 | |
| T | 8322 | 8.8% |
| R | 6665 | 7.0% |
| H | 6009 | 6.3% |
| M | 4793 | 5.1% |
| S | 3020 | 3.2% |
| W | 2355 | 2.5% |
| Other values (6) | 768 | 0.8% |
Other Punctuation
| Value | Count | Frequency (%) |
| ' | 180280 | |
| , | 47635 | 20.9% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 51420 |
Open Punctuation
| Value | Count | Frequency (%) |
| [ | 44837 |
Close Punctuation
| Value | Count | Frequency (%) |
| ] | 44837 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 601964 | |
| Common | 369009 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| r | 68492 | |
| a | 61215 | 10.2% |
| e | 55235 | 9.2% |
| m | 52509 | 8.7% |
| o | 47965 | 8.0% |
| i | 39295 | 6.5% |
| n | 35304 | 5.9% |
| y | 28158 | 4.7% |
| c | 27704 | 4.6% |
| t | 25957 | 4.3% |
| Other values (28) | 160130 |
Common
| Value | Count | Frequency (%) |
| ' | 180280 | |
| (space) | 51420 | 13.9% |
| , | 47635 | 12.9% |
| [ | 44837 | 12.2% |
| ] | 44837 | 12.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 970973 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| ' | 180280 | |
| r | 68492 | 7.1% |
| a | 61215 | 6.3% |
| e | 55235 | 5.7% |
| m | 52509 | 5.4% |
| (space) | 51420 | 5.3% |
| o | 47965 | 4.9% |
| , | 47635 | 4.9% |
| [ | 44837 | 4.6% |
| ] | 44837 | 4.6% |
| Other values (33) | 316548 |
| Distinct | 22477 |
|---|---|
| Distinct (%) | 50.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 350.4 KiB |
Length
| Max length | 173 |
|---|---|
| Median length | 161 |
| Mean length | 10.078663 |
| Min length | 2 |
Characters and Unicode
| Total characters | 451897 |
|---|---|
| Distinct characters | 14 |
| Distinct categories | 5 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 20148 |
|---|---|
| Unique (%) | 44.9% |
Sample
| 1st row | [3] |
|---|---|
| 2nd row | [559, 2550, 10201] |
| 3rd row | [6194, 19464] |
| 4th row | [306] |
| 5th row | [5842, 9195] |
| Value | Count | Frequency (%) |
| | 11604 | 14.2% |
| 6194 | 1246 | 1.5% |
| 8411 | 1074 | 1.3% |
| 4 | 1000 | 1.2% |
| 306 | 834 | 1.0% |
| 33 | 820 | 1.0% |
| 441 | 447 | 0.5% |
| 5358 | 432 | 0.5% |
| 5 | 427 | 0.5% |
| 6 | 290 | 0.4% |
| Other values (23501) | 63375 |
Most occurring characters
| Value | Count | Frequency (%) |
| [ | 44837 | |
| ] | 44837 | |
| 1 | 44042 | |
| , | 36712 | 8.1% |
| (space) | 36712 | 8.1% |
| 2 | 32237 | 7.1% |
| 3 | 31035 | 6.9% |
| 4 | 29931 | 6.6% |
| 6 | 27657 | 6.1% |
| 5 | 27368 | 6.1% |
| Other values (4) | 96529 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 288799 | |
| Open Punctuation | 44837 | 9.9% |
| Close Punctuation | 44837 | 9.9% |
| Other Punctuation | 36712 | 8.1% |
| Space Separator | 36712 | 8.1% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 44042 | |
| 2 | 32237 | |
| 3 | 31035 | |
| 4 | 29931 | |
| 6 | 27657 | |
| 5 | 27368 | |
| 8 | 25441 | |
| 7 | 24143 | |
| 9 | 23909 | |
| 0 | 23036 |
Open Punctuation
| Value | Count | Frequency (%) |
| [ | 44837 |
Close Punctuation
| Value | Count | Frequency (%) |
| ] | 44837 |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 36712 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 36712 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 451897 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| [ | 44837 | |
| ] | 44837 | |
| 1 | 44042 | |
| , | 36712 | 8.1% |
| (space) | 36712 | 8.1% |
| 2 | 32237 | 7.1% |
| 3 | 31035 | 6.9% |
| 4 | 29931 | 6.6% |
| 6 | 27657 | 6.1% |
| 5 | 27368 | 6.1% |
| Other values (4) | 96529 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 451897 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| [ | 44837 | |
| ] | 44837 | |
| 1 | 44042 | |
| , | 36712 | 8.1% |
| (space) | 36712 | 8.1% |
| 2 | 32237 | 7.1% |
| 3 | 31035 | 6.9% |
| 4 | 29931 | 6.6% |
| 6 | 27657 | 6.1% |
| 5 | 27368 | 6.1% |
| Other values (4) | 96529 |
Name_cast
Text
| Distinct | 42118 |
|---|---|
| Distinct (%) | 93.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 350.4 KiB |
Length
| Max length | 5179 |
|---|---|
| Median length | 1500 |
| Mean length | 215.10915 |
| Min length | 2 |
Characters and Unicode
| Total characters | 9644849 |
|---|---|
| Distinct characters | 394 |
| Distinct categories | 14 |
| Distinct scripts | 9 |
| Distinct blocks | 10 |
Unique
| Unique | 41917 |
|---|---|
| Unique (%) | 93.5% |
Sample
| 1st row | ['Tom Hanks', 'Tim Allen', 'Don Rickles', 'Jim Varney', 'Wallace Shawn', 'John Ratzenberger', 'Annie Potts', 'John Morris', 'Erik von Detten', 'Laurie Metcalf', 'R. Lee Ermey', 'Sarah Freeman', 'Penn Jillette'] |
|---|---|
| 2nd row | ['Robin Williams', 'Jonathan Hyde', 'Kirsten Dunst', 'Bradley Pierce', 'Bonnie Hunt', 'Bebe Neuwirth', 'David Alan Grier', 'Patricia Clarkson', 'Adam Hann-Byrd', 'Laura Bell Bundy', 'James Handy', 'Gillian Barber', 'Brandon Obray', 'Cyrus Thiedeke', 'Gary Joseph Thorup', 'Leonard Zola', 'Lloyd Berry', 'Malcolm Stewart', 'Annabel Kershaw', 'Darryl Henriques', 'Robyn Driscoll', 'Peter Bryant', 'Sarah Gilson', 'Florica Vlad', 'June Lion', 'Brenda Lockmuller'] |
| 3rd row | ['Walter Matthau', 'Jack Lemmon', 'Ann-Margret', 'Sophia Loren', 'Daryl Hannah', 'Burgess Meredith', 'Kevin Pollak'] |
| 4th row | ['Whitney Houston', 'Angela Bassett', 'Loretta Devine', 'Lela Rochon', 'Gregory Hines', 'Dennis Haysbert', 'Michael Beach', 'Mykelti Williamson', 'Lamont Johnson', 'Wesley Snipes'] |
| 5th row | ['Steve Martin', 'Diane Keaton', 'Martin Short', 'Kimberly Williams-Paisley', 'George Newbern', 'Kieran Culkin', 'BD Wong', 'Peter Michael Goetz', 'Kate McGregor-Stewart', 'Jane Adams', 'Eugene Levy', 'Lori Alan'] |
| Value | Count | Frequency (%) |
| john | 9721 | 0.8% |
| michael | 7413 | 0.6% |
| david | 6130 | 0.5% |
| robert | 5689 | 0.5% |
| james | 5637 | 0.5% |
| richard | 4407 | 0.4% |
| paul | 4296 | 0.4% |
| peter | 3866 | 0.3% |
| william | 3402 | 0.3% |
| george | 3399 | 0.3% |
| Other values (112094) | 1102819 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 1112000 | 11.5% |
| ' | 1109776 | 11.5% |
| a | 697990 | 7.2% |
| e | 659483 | 6.8% |
| n | 519392 | 5.4% |
| , | 514860 | 5.3% |
| r | 492893 | 5.1% |
| i | 479189 | 5.0% |
| o | 419776 | 4.4% |
| l | 363213 | 3.8% |
| Other values (384) | 3276277 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 5597512 | |
| Other Punctuation | 1651149 | 17.1% |
| Uppercase Letter | 1179686 | 12.2% |
| Space Separator | 1112000 | 11.5% |
| Open Punctuation | 44859 | 0.5% |
| Close Punctuation | 44845 | 0.5% |
| Dash Punctuation | 14012 | 0.1% |
| Other Letter | 543 | < 0.1% |
| Decimal Number | 115 | < 0.1% |
| Final Punctuation | 83 | < 0.1% |
| Other values (4) | 45 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 697990 | |
| e | 659483 | |
| n | 519392 | |
| r | 492893 | 8.8% |
| i | 479189 | 8.6% |
| o | 419776 | 7.5% |
| l | 363213 | 6.5% |
| s | 253368 | 4.5% |
| t | 250955 | 4.5% |
| h | 196075 | 3.5% |
| Other values (138) | 1265178 |
Other Letter
| Value | Count | Frequency (%) |
| ا | 32 | 5.9% |
| م | 31 | 5.7% |
| ع | 19 | 3.5% |
| ی | 19 | 3.5% |
| ن | 18 | 3.3% |
| د | 17 | 3.1% |
| ر | 17 | 3.1% |
| 松 | 17 | 3.1% |
| ي | 16 | 2.9% |
| 美 | 12 | 2.2% |
| Other values (104) | 345 |
Uppercase Letter
| Value | Count | Frequency (%) |
| M | 108425 | 9.2% |
| S | 91444 | 7.8% |
| C | 83341 | 7.1% |
| J | 82610 | 7.0% |
| B | 81638 | 6.9% |
| A | 70102 | 5.9% |
| R | 66817 | 5.7% |
| D | 65305 | 5.5% |
| L | 60670 | 5.1% |
| G | 54164 | 4.6% |
| Other values (81) | 415170 |
Other Punctuation
| Value | Count | Frequency (%) |
| ' | 1109776 | |
| , | 514860 | |
| . | 15905 | 1.0% |
| " | 10549 | 0.6% |
| \ | 32 | < 0.1% |
| · | 9 | < 0.1% |
| & | 6 | < 0.1% |
| : | 6 | < 0.1% |
| ! | 5 | < 0.1% |
| / | 1 | < 0.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 44 | |
| 5 | 37 | |
| 2 | 14 | 12.2% |
| 1 | 8 | 7.0% |
| 9 | 4 | 3.5% |
| 3 | 2 | 1.7% |
| 4 | 2 | 1.7% |
| 7 | 2 | 1.7% |
| 8 | 1 | 0.9% |
| 6 | 1 | 0.9% |
Nonspacing Mark
| Value | Count | Frequency (%) |
| ́ | 10 | |
| ิ | 2 | 11.8% |
| ่ | 1 | 5.9% |
| ึ | 1 | 5.9% |
| ี | 1 | 5.9% |
| ์ | 1 | 5.9% |
| ั | 1 | 5.9% |
Open Punctuation
| Value | Count | Frequency (%) |
| [ | 44837 | |
| „ | 14 | < 0.1% |
| ( | 8 | < 0.1% |
Final Punctuation
| Value | Count | Frequency (%) |
| ’ | 74 | |
| ” | 6 | 7.2% |
| » | 3 | 3.6% |
Close Punctuation
| Value | Count | Frequency (%) |
| ] | 44837 | |
| ) | 8 | < 0.1% |
Initial Punctuation
| Value | Count | Frequency (%) |
| “ | 20 | |
| « | 3 | 13.0% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 1112000 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 14012 |
Currency Symbol
| Value | Count | Frequency (%) |
| $ | 3 |
Modifier Symbol
| Value | Count | Frequency (%) |
| ´ | 2 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 6774205 | |
| Common | 2867091 | |
| Cyrillic | 2979 | < 0.1% |
| Han | 276 | < 0.1% |
| Arabic | 241 | < 0.1% |
| Thai | 27 | < 0.1% |
| Greek | 14 | < 0.1% |
| Inherited | 10 | < 0.1% |
| Hangul | 6 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 697990 | 10.3% |
| e | 659483 | 9.7% |
| n | 519392 | 7.7% |
| r | 492893 | 7.3% |
| i | 479189 | 7.1% |
| o | 419776 | 6.2% |
| l | 363213 | 5.4% |
| s | 253368 | 3.7% |
| t | 250955 | 3.7% |
| h | 196075 | 2.9% |
| Other values (163) | 2441871 |
Han
| Value | Count | Frequency (%) |
| 松 | 17 | 6.2% |
| 美 | 12 | 4.3% |
| 长 | 11 | 4.0% |
| 平 | 11 | 4.0% |
| 龙 | 11 | 4.0% |
| 田 | 11 | 4.0% |
| 雅 | 11 | 4.0% |
| 泽 | 11 | 4.0% |
| 杰 | 9 | 3.3% |
| 森 | 9 | 3.3% |
| Other values (55) | 163 |
Cyrillic
| Value | Count | Frequency (%) |
| а | 316 | 10.6% |
| и | 303 | 10.2% |
| о | 227 | 7.6% |
| н | 224 | 7.5% |
| р | 209 | 7.0% |
| е | 169 | 5.7% |
| л | 149 | 5.0% |
| к | 132 | 4.4% |
| т | 114 | 3.8% |
| с | 106 | 3.6% |
| Other values (51) | 1030 |
Common
| Value | Count | Frequency (%) |
| (space) | 1112000 | |
| ' | 1109776 | |
| , | 514860 | |
| [ | 44837 | 1.6% |
| ] | 44837 | 1.6% |
| . | 15905 | 0.6% |
| - | 14012 | 0.5% |
| " | 10549 | 0.4% |
| ’ | 74 | < 0.1% |
| 0 | 44 | < 0.1% |
| Other values (24) | 197 | < 0.1% |
Arabic
| Value | Count | Frequency (%) |
| ا | 32 | |
| م | 31 | |
| ع | 19 | 7.9% |
| ی | 19 | 7.9% |
| ن | 18 | 7.5% |
| د | 17 | 7.1% |
| ر | 17 | 7.1% |
| ي | 16 | 6.6% |
| ل | 9 | 3.7% |
| ب | 8 | 3.3% |
| Other values (18) | 55 |
Thai
| Value | Count | Frequency (%) |
| า | 2 | 7.4% |
| ร | 2 | 7.4% |
| ิ | 2 | 7.4% |
| ง | 2 | 7.4% |
| น | 2 | 7.4% |
| ว | 2 | 7.4% |
| ณ | 1 | 3.7% |
| ภ | 1 | 3.7% |
| ส | 1 | 3.7% |
| โ | 1 | 3.7% |
| Other values (11) | 11 |
Hangul
| Value | Count | Frequency (%) |
| 강 | 1 | |
| 계 | 1 | |
| 열 | 1 | |
| 만 | 1 | |
| 조 | 1 | |
| 병 | 1 |
Greek
| Value | Count | Frequency (%) |
| ν | 6 | |
| Ζ | 2 | 14.3% |
| α | 2 | 14.3% |
| ο | 2 | 14.3% |
| ί | 2 | 14.3% |
Inherited
| Value | Count | Frequency (%) |
| ́ | 10 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 9603360 | |
| None | 37780 | 0.4% |
| Cyrillic | 2979 | < 0.1% |
| CJK | 276 | < 0.1% |
| Arabic | 241 | < 0.1% |
| Punctuation | 114 | < 0.1% |
| Latin Ext Additional | 56 | < 0.1% |
| Thai | 27 | < 0.1% |
| Diacriticals | 10 | < 0.1% |
| Hangul | 6 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 1112000 | 11.6% |
| ' | 1109776 | 11.6% |
| a | 697990 | 7.3% |
| e | 659483 | 6.9% |
| n | 519392 | 5.4% |
| , | 514860 | 5.4% |
| r | 492893 | 5.1% |
| i | 479189 | 5.0% |
| o | 419776 | 4.4% |
| l | 363213 | 3.8% |
| Other values (68) | 3234788 |
None
| Value | Count | Frequency (%) |
| é | 8994 | |
| á | 4101 | 10.9% |
| í | 2720 | 7.2% |
| ô | 2319 | 6.1% |
| ö | 1997 | 5.3% |
| ó | 1852 | 4.9% |
| ü | 1475 | 3.9% |
| ć | 1352 | 3.6% |
| è | 1212 | 3.2% |
| ä | 991 | 2.6% |
| Other values (110) | 10767 |
Cyrillic
| Value | Count | Frequency (%) |
| а | 316 | 10.6% |
| и | 303 | 10.2% |
| о | 227 | 7.6% |
| н | 224 | 7.5% |
| р | 209 | 7.0% |
| е | 169 | 5.7% |
| л | 149 | 5.0% |
| к | 132 | 4.4% |
| т | 114 | 3.8% |
| с | 106 | 3.6% |
| Other values (51) | 1030 |
Punctuation
| Value | Count | Frequency (%) |
| ’ | 74 | |
| “ | 20 | 17.5% |
| „ | 14 | 12.3% |
| ” | 6 | 5.3% |
Arabic
| Value | Count | Frequency (%) |
| ا | 32 | |
| م | 31 | |
| ع | 19 | 7.9% |
| ی | 19 | 7.9% |
| ن | 18 | 7.5% |
| د | 17 | 7.1% |
| ر | 17 | 7.1% |
| ي | 16 | 6.6% |
| ل | 9 | 3.7% |
| ب | 8 | 3.3% |
| Other values (18) | 55 |
CJK
| Value | Count | Frequency (%) |
| 松 | 17 | 6.2% |
| 美 | 12 | 4.3% |
| 长 | 11 | 4.0% |
| 平 | 11 | 4.0% |
| 龙 | 11 | 4.0% |
| 田 | 11 | 4.0% |
| 雅 | 11 | 4.0% |
| 泽 | 11 | 4.0% |
| 杰 | 9 | 3.3% |
| 森 | 9 | 3.3% |
| Other values (55) | 163 |
Latin Ext Additional
| Value | Count | Frequency (%) |
| ễ | 15 | |
| ạ | 9 | |
| ị | 6 | 10.7% |
| ỳ | 6 | 10.7% |
| ế | 5 | 8.9% |
| ỗ | 4 | 7.1% |
| ả | 4 | 7.1% |
| ề | 4 | 7.1% |
| ầ | 2 | 3.6% |
| ố | 1 | 1.8% |
Diacriticals
| Value | Count | Frequency (%) |
| ́ | 10 |
Thai
| Value | Count | Frequency (%) |
| า | 2 | 7.4% |
| ร | 2 | 7.4% |
| ิ | 2 | 7.4% |
| ง | 2 | 7.4% |
| น | 2 | 7.4% |
| ว | 2 | 7.4% |
| ณ | 1 | 3.7% |
| ภ | 1 | 3.7% |
| ส | 1 | 3.7% |
| โ | 1 | 3.7% |
| Other values (11) | 11 |
Hangul
| Value | Count | Frequency (%) |
| 강 | 1 | |
| 계 | 1 | |
| 열 | 1 | |
| 만 | 1 | |
| 조 | 1 | |
| 병 | 1 |
Director
Text
| Distinct | 18622 |
|---|---|
| Distinct (%) | 41.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 350.4 KiB |
Length
| Max length | 738 |
|---|---|
| Median length | 530 |
| Mean length | 18.886768 |
| Min length | 2 |
Characters and Unicode
| Total characters | 846826 |
|---|---|
| Distinct characters | 206 |
| Distinct categories | 10 |
| Distinct scripts | 6 |
| Distinct blocks | 7 |
Unique
| Unique | 11971 |
|---|---|
| Unique (%) | 26.7% |
Sample
| 1st row | ['John Lasseter'] |
|---|---|
| 2nd row | ['Joe Johnston'] |
| 3rd row | ['Howard Deutch'] |
| 4th row | ['Forest Whitaker'] |
| 5th row | ['Charles Shyer'] |
| Value | Count | Frequency (%) |
| john | 1218 | 1.2% |
| michael | 940 | 0.9% |
| (empty) | 918 | 0.9% |
| david | 892 | 0.9% |
| robert | 847 | 0.8% |
| peter | 573 | 0.6% |
| william | 553 | 0.5% |
| richard | 538 | 0.5% |
| james | 521 | 0.5% |
| paul | 466 | 0.5% |
| Other values (18517) | 95145 |
Most occurring characters
| Value | Count | Frequency (%) |
| ' | 96597 | 11.4% |
| (space) | 57786 | 6.8% |
| e | 56757 | 6.7% |
| a | 56231 | 6.6% |
| ] | 44837 | 5.3% |
| [ | 44837 | 5.3% |
| r | 44156 | 5.2% |
| n | 43746 | 5.2% |
| i | 42466 | 5.0% |
| o | 38282 | 4.5% |
| Other values (196) | 321131 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 489921 | |
| Other Punctuation | 104492 | 12.3% |
| Uppercase Letter | 103542 | 12.2% |
| Space Separator | 57786 | 6.8% |
| Close Punctuation | 44839 | 5.3% |
| Open Punctuation | 44839 | 5.3% |
| Dash Punctuation | 1380 | 0.2% |
| Other Letter | 23 | < 0.1% |
| Decimal Number | 3 | < 0.1% |
| Math Symbol | 1 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 56757 | |
| a | 56231 | |
| r | 44156 | 9.0% |
| n | 43746 | 8.9% |
| i | 42466 | 8.7% |
| o | 38282 | 7.8% |
| l | 29818 | 6.1% |
| s | 22586 | 4.6% |
| t | 21495 | 4.4% |
| h | 18109 | 3.7% |
| Other values (97) | 116275 |
Uppercase Letter
| Value | Count | Frequency (%) |
| M | 9115 | 8.8% |
| S | 8660 | 8.4% |
| J | 7771 | 7.5% |
| R | 6618 | 6.4% |
| C | 6449 | 6.2% |
| B | 6433 | 6.2% |
| A | 6213 | 6.0% |
| D | 5520 | 5.3% |
| L | 5366 | 5.2% |
| G | 4926 | 4.8% |
| Other values (53) | 36471 |
Other Letter
| Value | Count | Frequency (%) |
| ی | 2 | 8.7% |
| م | 2 | 8.7% |
| ا | 2 | 8.7% |
| 张 | 1 | 4.3% |
| 立 | 1 | 4.3% |
| پ | 1 | 4.3% |
| ن | 1 | 4.3% |
| ع | 1 | 4.3% |
| د | 1 | 4.3% |
| 義 | 1 | 4.3% |
| Other values (10) | 10 |
Other Punctuation
| Value | Count | Frequency (%) |
| ' | 96597 | |
| , | 4405 | 4.2% |
| . | 3076 | 2.9% |
| " | 402 | 0.4% |
| \ | 11 | < 0.1% |
| · | 1 | < 0.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 1 | |
| 5 | 1 | |
| 9 | 1 |
Close Punctuation
| Value | Count | Frequency (%) |
| ] | 44837 | |
| ) | 2 | < 0.1% |
Open Punctuation
| Value | Count | Frequency (%) |
| [ | 44837 | |
| ( | 2 | < 0.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 57786 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 1380 |
Math Symbol
| Value | Count | Frequency (%) |
| \| | 1 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 593275 | |
| Common | 253340 | |
| Cyrillic | 188 | < 0.1% |
| Arabic | 10 | < 0.1% |
| Han | 10 | < 0.1% |
| Hangul | 3 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 56757 | 9.6% |
| a | 56231 | 9.5% |
| r | 44156 | 7.4% |
| n | 43746 | 7.4% |
| i | 42466 | 7.2% |
| o | 38282 | 6.5% |
| l | 29818 | 5.0% |
| s | 22586 | 3.8% |
| t | 21495 | 3.6% |
| h | 18109 | 3.1% |
| Other values (123) | 219629 |
Cyrillic
| Value | Count | Frequency (%) |
| и | 22 | 11.7% |
| о | 15 | 8.0% |
| е | 14 | 7.4% |
| а | 14 | 7.4% |
| р | 13 | 6.9% |
| к | 13 | 6.9% |
| л | 13 | 6.9% |
| н | 11 | 5.9% |
| д | 9 | 4.8% |
| в | 6 | 3.2% |
| Other values (27) | 58 |
Common
| Value | Count | Frequency (%) |
| ' | 96597 | |
| (space) | 57786 | |
| ] | 44837 | |
| [ | 44837 | |
| , | 4405 | 1.7% |
| . | 3076 | 1.2% |
| - | 1380 | 0.5% |
| " | 402 | 0.2% |
| \ | 11 | < 0.1% |
| ) | 2 | < 0.1% |
| Other values (6) | 7 | < 0.1% |
Han
| Value | Count | Frequency (%) |
| 张 | 1 | |
| 立 | 1 | |
| 義 | 1 | |
| 玛 | 1 | |
| 莫 | 1 | |
| 森 | 1 | |
| 杰 | 1 | |
| 塩 | 1 | |
| 谷 | 1 | |
| 直 | 1 |
Arabic
| Value | Count | Frequency (%) |
| ی | 2 | |
| م | 2 | |
| ا | 2 | |
| پ | 1 | |
| ن | 1 | |
| ع | 1 | |
| د | 1 |
Hangul
| Value | Count | Frequency (%) |
| 진 | 1 | |
| 모 | 1 | |
| 영 | 1 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 842513 | |
| None | 4099 | 0.5% |
| Cyrillic | 188 | < 0.1% |
| Arabic | 10 | < 0.1% |
| CJK | 10 | < 0.1% |
| Latin Ext Additional | 3 | < 0.1% |
| Hangul | 3 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| ' | 96597 | 11.5% |
| (space) | 57786 | 6.9% |
| e | 56757 | 6.7% |
| a | 56231 | 6.7% |
| ] | 44837 | 5.3% |
| [ | 44837 | 5.3% |
| r | 44156 | 5.2% |
| n | 43746 | 5.2% |
| i | 42466 | 5.0% |
| o | 38282 | 4.5% |
| Other values (57) | 316818 |
None
| Value | Count | Frequency (%) |
| é | 968 | |
| á | 406 | 9.9% |
| ö | 268 | 6.5% |
| í | 247 | 6.0% |
| ó | 235 | 5.7% |
| ô | 163 | 4.0% |
| ä | 152 | 3.7% |
| è | 128 | 3.1% |
| ü | 116 | 2.8% |
| ç | 110 | 2.7% |
| Other values (69) | 1306 |
Cyrillic
| Value | Count | Frequency (%) |
| и | 22 | 11.7% |
| о | 15 | 8.0% |
| е | 14 | 7.4% |
| а | 14 | 7.4% |
| р | 13 | 6.9% |
| к | 13 | 6.9% |
| л | 13 | 6.9% |
| н | 11 | 5.9% |
| д | 9 | 4.8% |
| в | 6 | 3.2% |
| Other values (27) | 58 |
Arabic
| Value | Count | Frequency (%) |
| ی | 2 | |
| م | 2 | |
| ا | 2 | |
| پ | 1 | |
| ن | 1 | |
| ع | 1 | |
| د | 1 |
Latin Ext Additional
| Value | Count | Frequency (%) |
| ễ | 1 | |
| ạ | 1 | |
| ấ | 1 |
CJK
| Value | Count | Frequency (%) |
| 张 | 1 | |
| 立 | 1 | |
| 義 | 1 | |
| 玛 | 1 | |
| 莫 | 1 | |
| 森 | 1 | |
| 杰 | 1 | |
| 塩 | 1 | |
| 谷 | 1 | |
| 直 | 1 |
Hangul
| Value | Count | Frequency (%) |
| 진 | 1 | |
| 모 | 1 | |
| 영 | 1 |
release_month
Real number (ℝ)
| Distinct | 12 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.4642371 |

| Minimum | 1 |
|---|---|
| Maximum | 12 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 350.4 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 3 |
| median | 7 |
| Q3 | 10 |
| 95-th percentile | 12 |
| Maximum | 12 |
| Range | 11 |
| Interquartile range (IQR) | 7 |
Descriptive statistics
| Standard deviation | 3.627512 |
|---|---|
| Coefficient of variation (CV) | 0.5611663 |
| Kurtosis | -1.3242628 |
| Mean | 6.4642371 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | -0.073011831 |
| Sum | 289837 |
| Variance | 13.158843 |
| Monotonicity | Not monotonic |
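As a quick consistency check, the figures in this table can be related to one another with the usual definitions: mean = sum / number of observations, variance = standard deviation squared, and coefficient of variation = standard deviation / mean. The snippet below only reuses numbers reported for `release_month`, plus the observation count of 44837 from the dataset overview. The variable names are illustrative.

```python
import math

# Values copied from the release_month statistics and the dataset overview.
n_observations = 44837
reported_sum = 289837
reported_mean = 6.4642371
reported_std = 3.627512
reported_variance = 13.158843
reported_cv = 0.5611663

# Re-derive each statistic from the others.
derived_mean = reported_sum / n_observations
derived_variance = reported_std ** 2
derived_cv = reported_std / reported_mean

# Compare derived and reported values within rounding tolerance.
checks = {
    "mean": math.isclose(derived_mean, reported_mean, rel_tol=1e-6),
    "variance": math.isclose(derived_variance, reported_variance, rel_tol=1e-6),
    "cv": math.isclose(derived_cv, reported_cv, rel_tol=1e-6),
}
print(checks)
```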
| Value | Count | Frequency (%) |
| 1 | 5816 | |
| 9 | 4786 | |
| 10 | 4559 | |
| 12 | 3754 | |
| 11 | 3622 | |
| 3 | 3512 | |
| 4 | 3411 | |
| 8 | 3359 | |
| 5 | 3313 | |
| 6 | 3105 | |
| Other values (2) | 5600 |
| Value | Count | Frequency (%) |
| 1 | 5816 | |
| 2 | 2996 | |
| 3 | 3512 | |
| 4 | 3411 | |
| 5 | 3313 | |
| 6 | 3105 | |
| 7 | 2604 | |
| 8 | 3359 | |
| 9 | 4786 | |
| 10 | 4559 |
| Value | Count | Frequency (%) |
| 12 | 3754 | |
| 11 | 3622 | |
| 10 | 4559 | |
| 9 | 4786 | |
| 8 | 3359 | |
| 7 | 2604 | |
| 6 | 3105 | |
| 5 | 3313 | |
| 4 | 3411 | |
| 3 | 3512 |
release_year
Real number (ℝ)
| Distinct | 135 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1991.8344 |

| Minimum | 1874 |
|---|---|
| Maximum | 2020 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 350.4 KiB |
Quantile statistics
| Minimum | 1874 |
|---|---|
| 5-th percentile | 1941 |
| Q1 | 1978 |
| median | 2001 |
| Q3 | 2010 |
| 95-th percentile | 2015 |
| Maximum | 2020 |
| Range | 146 |
| Interquartile range (IQR) | 32 |
Descriptive statistics
| Standard deviation | 24.005446 |
|---|---|
| Coefficient of variation (CV) | 0.012051929 |
| Kurtosis | 0.79577462 |
| Mean | 1991.8344 |
| Median Absolute Deviation (MAD) | 12 |
| Skewness | -1.2142281 |
| Sum | 89307879 |
| Variance | 576.26143 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2014 | 1953 | 4.4% |
| 2015 | 1889 | 4.2% |
| 2013 | 1875 | 4.2% |
| 2012 | 1710 | 3.8% |
| 2011 | 1651 | 3.7% |
| 2009 | 1574 | 3.5% |
| 2016 | 1560 | 3.5% |
| 2010 | 1477 | 3.3% |
| 2008 | 1453 | 3.2% |
| 2007 | 1305 | 2.9% |
| Other values (125) | 28390 |
| Value | Count | Frequency (%) |
| 1874 | 1 | < 0.1% |
| 1878 | 1 | < 0.1% |
| 1883 | 1 | < 0.1% |
| 1887 | 1 | < 0.1% |
| 1888 | 2 | < 0.1% |
| 1890 | 5 | |
| 1891 | 6 | |
| 1892 | 3 | < 0.1% |
| 1893 | 1 | < 0.1% |
| 1894 | 12 |
| Value | Count | Frequency (%) |
| 2020 | 1 | < 0.1% |
| 2018 | 5 | < 0.1% |
| 2017 | 451 | 1.0% |
| 2016 | 1560 | |
| 2015 | 1889 | |
| 2014 | 1953 | |
| 2013 | 1875 | |
| 2012 | 1710 | |
| 2011 | 1651 | |
| 2010 | 1477 |
Correlations
| | id | budget | popularity | runtime | vote_average | vote_count | id_collection | release_month | release_year |
|---|---|---|---|---|---|---|---|---|---|
| id | 1.000 | -0.256 | -0.412 | -0.204 | -0.150 | -0.434 | 0.221 | -0.014 | 0.388 |
| budget | -0.256 | 1.000 | 0.465 | 0.227 | 0.072 | 0.486 | -0.129 | 0.046 | 0.143 |
| popularity | -0.412 | 0.465 | 1.000 | 0.306 | 0.241 | 0.894 | -0.154 | 0.072 | 0.186 |
| runtime | -0.204 | 0.227 | 0.306 | 1.000 | 0.195 | 0.290 | -0.102 | 0.071 | 0.034 |
| vote_average | -0.150 | 0.072 | 0.241 | 0.195 | 1.000 | 0.317 | -0.053 | 0.048 | -0.009 |
| vote_count | -0.434 | 0.486 | 0.894 | 0.290 | 0.317 | 1.000 | -0.154 | 0.064 | 0.200 |
| id_collection | 0.221 | -0.129 | -0.154 | -0.102 | -0.053 | -0.154 | 1.000 | -0.025 | 0.116 |
| release_month | -0.014 | 0.046 | 0.072 | 0.071 | 0.048 | 0.064 | -0.025 | 1.000 | -0.015 |
| release_year | 0.388 | 0.143 | 0.186 | 0.034 | -0.009 | 0.200 | 0.116 | -0.015 | 1.000 |
First rows
| | id | title | budget | original_language | popularity | runtime | vote_average | vote_count | id_collection | name_genres | Id_production_companies | Name_cast | Director | release_month | release_year |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 862 | Toy Story | 30000000.0 | en | 21.946943 | 81.0 | 7.7 | 5415.0 | 10194.0 | ['Animation', 'Comedy', 'Family'] | [3] | ['Tom Hanks', 'Tim Allen', 'Don Rickles', 'Jim Varney', 'Wallace Shawn', 'John Ratzenberger', 'Annie Potts', 'John Morris', 'Erik von Detten', 'Laurie Metcalf', 'R. Lee Ermey', 'Sarah Freeman', 'Penn Jillette'] | ['John Lasseter'] | 10 | 1995 |
| 1 | 8844 | Jumanji | 65000000.0 | en | 17.015539 | 104.0 | 6.9 | 2413.0 | NaN | ['Adventure', 'Fantasy', 'Family'] | [559, 2550, 10201] | ['Robin Williams', 'Jonathan Hyde', 'Kirsten Dunst', 'Bradley Pierce', 'Bonnie Hunt', 'Bebe Neuwirth', 'David Alan Grier', 'Patricia Clarkson', 'Adam Hann-Byrd', 'Laura Bell Bundy', 'James Handy', 'Gillian Barber', 'Brandon Obray', 'Cyrus Thiedeke', 'Gary Joseph Thorup', 'Leonard Zola', 'Lloyd Berry', 'Malcolm Stewart', 'Annabel Kershaw', 'Darryl Henriques', 'Robyn Driscoll', 'Peter Bryant', 'Sarah Gilson', 'Florica Vlad', 'June Lion', 'Brenda Lockmuller'] | ['Joe Johnston'] | 12 | 1995 |
| 2 | 15602 | Grumpier Old Men | 0.0 | en | 11.712900 | 101.0 | 6.5 | 92.0 | 119050.0 | ['Romance', 'Comedy'] | [6194, 19464] | ['Walter Matthau', 'Jack Lemmon', 'Ann-Margret', 'Sophia Loren', 'Daryl Hannah', 'Burgess Meredith', 'Kevin Pollak'] | ['Howard Deutch'] | 12 | 1995 |
| 3 | 31357 | Waiting to Exhale | 16000000.0 | en | 3.859495 | 127.0 | 6.1 | 34.0 | NaN | ['Comedy', 'Drama', 'Romance'] | [306] | ['Whitney Houston', 'Angela Bassett', 'Loretta Devine', 'Lela Rochon', 'Gregory Hines', 'Dennis Haysbert', 'Michael Beach', 'Mykelti Williamson', 'Lamont Johnson', 'Wesley Snipes'] | ['Forest Whitaker'] | 12 | 1995 |
| 4 | 11862 | Father of the Bride Part II | 0.0 | en | 8.387519 | 106.0 | 5.7 | 173.0 | 96871.0 | ['Comedy'] | [5842, 9195] | ['Steve Martin', 'Diane Keaton', 'Martin Short', 'Kimberly Williams-Paisley', 'George Newbern', 'Kieran Culkin', 'BD Wong', 'Peter Michael Goetz', 'Kate McGregor-Stewart', 'Jane Adams', 'Eugene Levy', 'Lori Alan'] | ['Charles Shyer'] | 2 | 1995 |
| 5 | 949 | Heat | 60000000.0 | en | 17.924927 | 170.0 | 7.7 | 1886.0 | NaN | ['Action', 'Crime', 'Drama', 'Thriller'] | [508, 675, 6194] | ['Al Pacino', 'Robert De Niro', 'Val Kilmer', 'Jon Voight', 'Tom Sizemore', 'Diane Venora', 'Amy Brenneman', 'Ashley Judd', 'Mykelti Williamson', 'Natalie Portman', 'Ted Levine', 'Tom Noonan', 'Tone Loc', 'Hank Azaria', 'Wes Studi', 'Dennis Haysbert', 'Danny Trejo', 'Henry Rollins', 'William Fichtner', 'Kevin Gage', 'Susan Traylor', 'Jerry Trimble', 'Ricky Harris', 'Jeremy Piven', 'Xander Berkeley', 'Begonya Plaza', 'Rick Avery', 'Hazelle Goodman', 'Ray Buktenica', 'Max Daniels', 'Vince Deadrick Jr.', 'Steven Ford', 'Farrah Forke', 'Patricia Healy', 'Paul Herman', 'Cindy Katz', 'Brian Libby', 'Dan Martin', 'Mario Roberts', 'Thomas Rosales, Jr.', 'Yvonne Zima', 'Mick Gould', 'Bud Cort', 'Viviane Vives', 'Kim Staunton', 'Martin Ferrero', 'Brad Baldridge', 'Andrew Camuccio', 'Kenny Endoso', 'Kimberly Flynn', 'Niki Harris', 'Bill McIntosh', 'Rick Marzan', 'Terry Miller', "Daniel O'Haco", 'Kai Soremekun', 'Peter Blackwell', 'Trevor Coppola', 'Mary Kircher', 'Darin Mangan', 'Robert Miranda', 'Manny Perry', 'Iva Franks Singer', 'Tim Werner', 'Philip Ettington'] | ['Michael Mann'] | 12 | 1995 |
| 6 | 11860 | Sabrina | 58000000.0 | en | 6.677277 | 127.0 | 6.2 | 141.0 | NaN | ['Comedy', 'Romance'] | [4, 258, 932, 5842, 14941, 55873, 58079] | ['Harrison Ford', 'Julia Ormond', 'Greg Kinnear', 'Angie Dickinson', 'Nancy Marchand', 'John Wood', 'Richard Crenna', 'Lauren Holly', 'Dana Ivey', 'Fanny Ardant', 'Patrick Bruel', 'Paul Giamatti', 'Miriam Colón', 'Elizabeth Franz', 'Valérie Lemercier', 'Becky Ann Baker', 'John C. Vennema', 'Margo Martindale', 'J. Smith-Cameron', 'Christine Luneau-Lipton', 'Michael Dees', 'Denis Holmes', 'Jo-Jo Lowe', 'Ira Wheeler', 'Philippa Cooper', 'Ayako Kawahara', 'François Genty', 'Guillaume Gallienne', 'Inés Sastre', 'Phina Oruche', 'Andrea Behalikova', 'Jennifer Herrera', 'Kristina Kumlin', 'Eva Linderholm', 'Carmen Chaplin', 'Micheline Van de Velde', 'Joanna Rhodes', 'Alan Boone', 'Patrick Forster-Delmas', 'Kentaro Matsuo', 'Peter McKernan', 'Ed Connelly', 'Ronald L. Schwary', 'Alvin Lum', 'Siching Song', 'Phil Nee', 'Randy Becker', 'Susan Browning', 'Anthony Mondal', 'Peter Parks', 'Woodrow Asai', 'Eric Bruno Borgman', 'Michael Cline', 'Christopher Del Gaudio', 'Philippe Hartmann', 'Jerry Quinn', 'Dori Rosenthal'] | ['Sydney Pollack'] | 12 | 1995 |
| 7 | 45325 | Tom and Huck | 0.0 | en | 2.561161 | 97.0 | 5.4 | 45.0 | NaN | ['Action', 'Adventure', 'Drama', 'Family'] | [2] | ['Jonathan Taylor Thomas', 'Brad Renfro', 'Rachael Leigh Cook', 'Michael McShane', 'Amy Wright', 'Eric Schweig', 'Tamara Mello'] | ['Peter Hewitt'] | 12 | 1995 |
| 8 | 9091 | Sudden Death | 35000000.0 | en | 5.231580 | 106.0 | 5.5 | 174.0 | NaN | ['Action', 'Adventure', 'Thriller'] | [33, 21437, 23770] | ['Jean-Claude Van Damme', 'Powers Boothe', 'Dorian Harewood', 'Raymond J. Barry', 'Ross Malinger', 'Whittni Wright'] | ['Peter Hyams'] | 12 | 1995 |
| 9 | 710 | GoldenEye | 58000000.0 | en | 14.686036 | 130.0 | 6.6 | 1194.0 | 645.0 | ['Adventure', 'Action', 'Thriller'] | [60, 7576] | ['Pierce Brosnan', 'Sean Bean', 'Izabella Scorupco', 'Famke Janssen', 'Joe Don Baker', 'Judi Dench', 'Gottfried John', 'Robbie Coltrane', 'Alan Cumming', 'Tchéky Karyo', 'Desmond Llewelyn', 'Samantha Bond', 'Michael Kitchen', 'Serena Gordon', 'Simon Kunz', 'Billy J. Mitchell', 'Constantine Gregory', 'Minnie Driver', 'Michelle Arthur', 'Ravil Isyanov'] | ['Martin Campbell'] | 11 | 1995 |
Last rows
| | id | title | budget | original_language | popularity | runtime | vote_average | vote_count | id_collection | name_genres | Id_production_companies | Name_cast | Director | release_month | release_year |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 44827 | 459928 | 12 Feet Deep | 0.0 | en | 4.479536 | 85.0 | 5.1 | 62.0 | NaN | ['Animation', 'Music'] | [] | ['Konsta Hietanen', 'Risto Tuorila', 'Jarmo Mäkinen', 'Antti Virmavirta', 'Kristiina Halttu', 'Rauno Ahonen'] | ['Matt Eskandari'] | 1 | 2016 |
| 44828 | 258514 | Van Gogh: Painted with Words | 0.0 | en | 2.071003 | 80.0 | 6.7 | 11.0 | NaN | ['Horror', 'Fantasy'] | [5062] | ['Kathy Bates', 'Victor Garber', 'Alan Cumming', 'Audra McDonald', 'Kristin Chenoweth', 'Erin Adams', 'Sarah Hyland', 'Lalaine', 'Nanea Miyata', 'Marissa Rago', 'Danelle Wilson', 'Andrea McArdle', 'Alicia Morton', 'Dennis Howard', 'Douglas Fisher', 'Kurt Knudson', 'Brooks Almy', 'Ruth Gottschall', 'Tom Billett', 'Frank Cavestani', 'Ellen Gerstein', 'David Pevsner', 'Ed Francis Martin', 'Bob Morrisey'] | ['Andrew Hutton'] | 4 | 2010 |
| 44829 | 382455 | Don't Call Me Son | 0.0 | pt | 0.552199 | 82.0 | 6.7 | 13.0 | NaN | ['Action', 'Crime', 'History'] | [670, 8797, 10470, 10471] | ['Nicolas Cage', 'Gina Gershon', 'Nicky Whelan', 'Faye Dunaway', 'Natalie Nelson', 'James Van Patten', 'Jonathan Baker', 'Leah Huebner', 'Ele Bardha', 'Corrie Danieley'] | ['Anna Muylaert'] | 7 | 2016 |
| 44830 | 217917 | The Wrong Road | 0.0 | en | 0.316432 | 62.0 | 5.0 | 1.0 | NaN | ['Documentary'] | [73685, 89911] | ['Pirkka-Pekka Petelius', 'Paavo Kerosuo', 'Pekka Strang', 'Johanna af Schultén', 'Cecilia Paul', 'Emil Lundberg', 'Peter Franzén', 'Alexander Skarsgård'] | ['James Cruze'] | 10 | 1937 |
| 44831 | 42616 | The Virginian | 0.0 | en | 0.037264 | 91.0 | 8.0 | 1.0 | NaN | ['Adventure', 'Drama', 'Family'] | [] | ['Christopher Plummer', 'Tom Bosley', 'Bob Elliott', 'Ray Goulding', 'Frank Gorshin', 'Tony Randall'] | ['Victor Fleming'] | 11 | 1929 |
| 44832 | 74384 | San Giovanni decollato | 0.0 | it | 0.400952 | 0.0 | 5.7 | 3.0 | NaN | ['Music', 'Family', 'Comedy'] | [] | ['Patrick Huard', 'Colm Feore', 'Erik Knudsen', 'Noam Jenkins', 'Sarah-Jeanne Labrosse', 'Lucie Laurier', 'Andre Bedard'] | ['Amleto Palermi', 'Giorgio Bianchi'] | 12 | 1940 |
| 44833 | 64043 | I due orfanelli | 0.0 | it | 0.199214 | 0.0 | 3.5 | 4.0 | NaN | ['Thriller', 'Drama'] | [7177, 68273] | ['Gloria Blondell'] | ['Mario Mattoli'] | 1 | 1947 |
| 44834 | 70207 | The Crooked E: The Unshredded Truth About Enron | 0.0 | en | 0.085047 | 100.0 | 2.5 | 2.0 | NaN | [] | [3166] | ['Antonio Banderas', 'Ben Kingsley', 'Liam McIntyre', 'Chad Lindberg', 'Gabriella Wright', 'Cung Le', 'Mark Smith', 'Bashar Rahal', 'Yana Marinova', 'Jiro Wang', 'Ivailo Dimitrov', 'Velimir Velev', 'Mark Basnight', 'Lillian Blankenship', 'Katherine de la Rocha', 'Shari Watson'] | ['Penelope Spheeris'] | 1 | 2003 |
| 44835 | 29458 | One Hundred Steps | 0.0 | it | 4.675250 | 114.0 | 7.8 | 116.0 | 104774.0 | ['Animation', 'Family'] | [25473, 74795] | ['Steve John Shepherd', 'Ben Waters', 'Alec Newman', 'Chiwetel Ejiofor', 'Anjela Lauren Smith', 'Melanie Gutteridge', 'Georgia Mackenzie', 'Alicya Eyo', 'Freddie Annobil-Dodoo', 'Alun Armstrong', 'John Blundell', 'Karl Collins', 'Robbie Gee'] | ['Marco Tullio Giordana'] | 8 | 2000 |
| 44836 | 160788 | Color of the Ocean | 0.0 | de | 0.119252 | 95.0 | 3.0 | 2.0 | 417491.0 | ['Crime', 'Comedy', 'Action'] | [8833] | [] | ['Maggie Peren'] | 3 | 2012 |
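The list-valued columns (`name_genres`, `Id_production_companies`, `Name_cast`, `Director`) appear in the sample rows as bracketed lists, while `Director` is profiled as Text with exactly one `[` and one `]` per row. This suggests each cell holds the string form of a Python list. Assuming that is the case, one way to recover real lists before further analysis is `ast.literal_eval`. `parse_list_cell` is a hypothetical helper and not part of the report.

```python
import ast


def parse_list_cell(cell: str) -> list:
    """Parse a cell such as "['A', 'B']" into a list; return [] if it is not a list literal."""
    try:
        value = ast.literal_eval(cell)
    except (ValueError, SyntaxError):
        return []
    return value if isinstance(value, list) else []


# Cells copied from the sample rows above.
directors = parse_list_cell("['Amleto Palermi', 'Giorgio Bianchi']")
companies = parse_list_cell("[559, 2550, 10201]")
empty_cast = parse_list_cell("[]")

print(directors, companies, empty_cast)
```

With a pandas DataFrame (assumed here to be named `df`), the same helper could be applied per column, for example `df["Director"].apply(parse_list_cell)`.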